Take Home Exercise 2

A study on the distribution of Airbnb Listings in Singapore, and how has COVID-19 impacted them.

Teo Jun Peng https://teojp3-is415.netlify.app/
09-19-2021

1. Introduction

Singapore is one of the few countries that does not legalise short-term rentals for properties, a minimum stay of three months is still required for rental of properties as of 2019. Therefore, it was surprising to discover data sets for Airbnb listings located in Singapore in Inside Airbnb. But we shall dive deeper into this phenomenon of Airbnb being allowed in Singapore perhaps another day.

For this particular exercise, we will utilising those data sets to generate useful insights through:

1.1 The Data

For analysis purposes, the following data are used:

2. Setting up the environment

To begin the study, R packages will be used for efficiency and a more comprehensive analysis, such as tidyverse and sf etc.

packages <- c("readr", "maptools", "sf", "raster", "spatstat", "tmap", "tidyverse",
    "plotly", "ggthemes")
for (p in packages) {
    if (!require(p, character.only = T)) {
        install.packages(p)
    }
    library(p, character.only = T)
}

3. Data Wrangling

3.1 Aspatial Data

3.1.1 Importing in Aspatial Data

We will first import in data using the Airbnb (2019 and 2021) datasets, as we are interested in analysing the distribution of Airbnb listings Pre and Post COVID-19. For Pre-COVID context, the analysis aims to uncover whether the distribution of Airbnb listings are affected by location factors such as located near hotels or tourist attractions, so those datasets are imported in as well.

airbnb_2019 <- read_csv("_posts/2021-09-19-take-home-exercise-2/the2data/airbnb2019.csv")
airbnb_2021 <- read_csv("_posts/2021-09-19-take-home-exercise-2/the2data/airbnb2021.csv")
sg_hotels <- read_csv("_posts/2021-09-19-take-home-exercise-2/the2data/OneMapData/hotels.csv")
tourist_attractions <- read_csv("_posts/2021-09-19-take-home-exercise-2/the2data/OneMapData/tourism.csv")

We will also utilise an additional dataset from SLA OneMap, which is the Hawker Centres dataset. Singapore’s street food or known locally as hawker food culture has helped put the tiny country on the world map, with the hawker culture being featured in international food documentary series such as Netflix’s Street Food: Asia and also recently being recognised as a UNESCO heritage representative of Singapore.

Therefore, it would be interesting to see if the location of Airbnb listings would be located near more famous Hawker Centres, such as Maxwell Road Hawker Centre and Newton Food Centre etc.

The dataset will be imported using the One Map Api as shown in the below code chunk.

library(onemapsgapi)

hawker_centres <- get_theme("eyJ0eXAiOiJKV1QiLCJhbGciOiJIUzI1NiJ9.eyJzdWIiOjc5NTgsInVzZXJfaWQiOjc5NTgsImVtYWlsIjoianVucGVuZy50ZW8uMjAxOUBzbXUuZWR1LnNnIiwiZm9yZXZlciI6ZmFsc2UsImlzcyI6Imh0dHA6XC9cL29tMi5kZmUub25lbWFwLnNnXC9hcGlcL3YyXC91c2VyXC9zZXNzaW9uIiwiaWF0IjoxNjMxNzcyNzMzLCJleHAiOjE2MzIyMDQ3MzMsIm5iZiI6MTYzMTc3MjczMywianRpIjoiODc4OGY3NGI4Yjk5ZDZmZTI1NWFmNjExYjlkNTZiYjQifQ.N-KrWjJPHLJIQVwpBLjyDg6s0JWVNESINp7-7YjUEQI",
    "hawkercentre")

write.csv(hawker_centres, "_posts/2021-09-19-take-home-exercise-2/the2data/hawker_centres.csv",
    row.names = FALSE)

The hawker_centres dataset is written out as a csv for easier bookkeeping purposes

Importing in the hawker_centres from its csv file

hawker_centres <- read_csv(file = "_posts/2021-09-19-take-home-exercise-2/the2data/hawker_centres.csv")

3.1.2 Recalibrating Projected Coordinates Systems to SVY21

The coordinates in the respective data frames were defined in values of Latitude and Longitude, therefore being congruent with EPSG 4326. However, Singapore’s national Project Coordination Systems should be EPSG 3414 (SVY21) instead, therefore we will transform the crs to EPSG 3414.

airbnb_2019 <- st_as_sf(airbnb_2019, coords = c("longitude", "latitude"), crs = 4326) %>%
    st_transform(crs = 3414)

airbnb_2021 <- st_as_sf(airbnb_2021, coords = c("longitude", "latitude"), crs = 4326) %>%
    st_transform(crs = 3414)

sg_hotels <- st_as_sf(sg_hotels, coords = c("Lng", "Lat"), crs = 4326) %>%
    st_transform(crs = 3414)

hawker_centres <- st_as_sf(hawker_centres, coords = c("Lng", "Lat"), crs = 4326) %>%
    st_transform(crs = 3414)

For tourist attractions, an extra step is needed because there are NA values, so we have to drop them. If not, we will get an error “missing values in coordinates not allowed”.

tourist_attractions <- tourist_attractions[!is.na(tourist_attractions$LATITUDE),
    ]
tourist_attractions <- st_as_sf(tourist_attractions, coords = c("LONGTITUDE", "LATITUDE"),
    crs = 4326) %>%
    st_transform(crs = 3414)

Checking and ensuring Projected Coordinates Systems of the sf objects is correct

st_crs(airbnb_2019)
st_crs(airbnb_2021)
st_crs(sg_hotels)
st_crs(hawker_centres)
st_crs(tourist_attractions)

3.2 Spatial Data

3.2.1 Importing in Geospatial Data

Importing shapefile of MRT Stations locations using st_read()* of sf package. The output object will be in tibble sf object class.

We will be looking at the correlation between location of Airbnb listings and MRT stations in Singapore.

mrt_sf <- st_read(dsn = "_posts/2021-09-19-take-home-exercise-2/the2data/MRT", layer = "MRTLRTStnPtt")
Reading layer `MRTLRTStnPtt' from data source 
  `C:\teojp3\IS415_blog\_posts\2021-09-19-take-home-exercise-2\the2data\MRT' 
  using driver `ESRI Shapefile'
Simple feature collection with 171 features and 3 fields
Geometry type: POINT
Dimension:     XY
Bounding box:  xmin: 6138.311 ymin: 27555.06 xmax: 45254.86 ymax: 47854.2
Projected CRS: SVY21

The output of the above code chunk showed that projection of mrt_sf is in SVY21, so we are good to proceed on because it is the correct projected coordinates system.

3.3 Geospatial Data Wrangling

3.3.1 Converting sf data frames into sp’s Spatial* class

The code chunk below uses as_Spatial() of sf package to convert the respective geospatial data from simple feature data frame to sp’s Spatial* class.

airbnb_2019_sp <- as_Spatial(airbnb_2019)
airbnb_2021_sp <- as_Spatial(airbnb_2021)
hotels_sp <- as_Spatial(sg_hotels)
hawker_sp <- as_Spatial(hawker_centres)
tourist_attractions_sp <- as_Spatial(tourist_attractions)
mrt_sp <- as_Spatial(mrt_sf)

Taking a look at a chosen few of the above data frames to ensure that they are converted to Spatial* classes successfully.

airbnb_2019_sp
class       : SpatialPointsDataFrame 
features    : 8293 
extent      : 7215.566, 44098.31, 25166.35, 49226.35  (xmin, xmax, ymin, ymax)
crs         : +proj=tmerc +lat_0=1.36666666666667 +lon_0=103.833333333333 +k=1 +x_0=28001.642 +y_0=38744.572 +ellps=WGS84 +towgs84=0,0,0,0,0,0,0 +units=m +no_defs 
variables   : 14
names       :       id,                                              name,   host_id,                host_name, neighbourhood_group, neighbourhood,       room_type, price, minimum_nights, number_of_reviews, last_review, reviews_per_month, calculated_host_listings_count, availability_365 
min values  :    49091,                                                 -,     23666, (Email hidden by Airbnb),      Central Region,    Ang Mo Kio, Entire home/apt,     0,              1,                 0,       15656,              0.01,                              1,                0 
max values  : 36053005, ZR2- NEW! Sunny & Modern Apt 4 mins to Orchard Rd, 271165196,                   Zuzana,         West Region,        Yishun,     Shared room, 13999,           1000,               308,       18072,             12.09,                            277,              365 
hotels_sp
class       : SpatialPointsDataFrame 
features    : 422 
extent      : 5939.241, 45334.18, 25379.44, 44562.4  (xmin, xmax, ymin, ymax)
crs         : +proj=tmerc +lat_0=1.36666666666667 +lon_0=103.833333333333 +k=1 +x_0=28001.642 +y_0=38744.572 +ellps=WGS84 +towgs84=0,0,0,0,0,0,0 +units=m +no_defs 
variables   : 7
names       :                              NAME, ADDRESSPOSTALCODE,                                 ADDRESSSTREETNAME,         HYPERLINK, TOTALROOMS,     KEEPERNAME, ICON_NAME 
min values  :                      30 BENCOOLEN,             18956,                                 1 Bayfront Avenue, 96ytlim@gmail.com,          4,  Adel Aramouni, hotel.gif 
max values  : YotelAir Singapore Changi Airport,            819666, 99 IRRAWADDY ROAD, # 22-00 ROYAL SQUARE AT NOVENA, zubair@dam.com.sg,       2561, Zhang YuanQing, hotel.gif 

3.3.2 Converting the Spatial* classes into generic sp format

This conversion is required because spatstat only takes in analytical data in ppp object form.

The below code chunks converts the Spatial* classes into generic sp objects.

airbnb_2019_sp <- as(airbnb_2019_sp, "SpatialPoints")
airbnb_2021_sp <- as(airbnb_2021_sp, "SpatialPoints")
hotels_sp <- as(hotels_sp, "SpatialPoints")
hawker_sp <- as(hawker_sp, "SpatialPoints")
tourist_attractions_sp <- as(tourist_attractions_sp, "SpatialPoints")
mrt_sp <- as(mrt_sp, "SpatialPoints")

Taking a look at a chosen few of the above Spatial* classes to ensure that they are converted to generic sp objects successfully.

hawker_sp
class       : SpatialPoints 
features    : 125 
extent      : 12874.19, 45241.4, 28355.97, 47872.53  (xmin, xmax, ymin, ymax)
crs         : +proj=tmerc +lat_0=1.36666666666667 +lon_0=103.833333333333 +k=1 +x_0=28001.642 +y_0=38744.572 +ellps=WGS84 +towgs84=0,0,0,0,0,0,0 +units=m +no_defs 
mrt_sp
class       : SpatialPoints 
features    : 171 
extent      : 6138.311, 45254.86, 27555.06, 47854.2  (xmin, xmax, ymin, ymax)
crs         : +proj=tmerc +lat_0=1.36666666666667 +lon_0=103.833333333333 +k=1 +x_0=28001.642 +y_0=38744.572 +datum=WGS84 +units=m +no_defs 

3.3.3 Converting the generic sp objects into spatstat’s ppp format

In this step, we will use as.ppp() function of spatstat to convert the spatial data into spatstat’s ppp object format.

airbnb_2019_ppp <- as(airbnb_2019_sp, "ppp")
airbnb_2021_ppp <- as(airbnb_2021_sp, "ppp")
hotels_ppp <- as(hotels_sp, "ppp")
hawker_ppp <- as(hawker_sp, "ppp")
tourist_attractions_ppp <- as(tourist_attractions_sp, "ppp")
mrt_ppp <- as(mrt_sp, "ppp")
plot(airbnb_2019_ppp)

summary(airbnb_2019_ppp)
Planar point pattern:  8293 points
Average intensity 9.345289e-06 points per square unit

*Pattern contains duplicated points*

Coordinates are given to 3 decimal places
i.e. rounded to the nearest multiple of 0.001 units

Window: rectangle = [7215.57, 44098.31] x [25166.35, 49226.35] units
                    (36880 x 24060 units)
Window area = 887399000 square units

3.3.4 Handling duplicated points in ppp objects

Duplicated points are commonly found in ppp objects as shown in the output above, but the presence of duplicates will affect the accuracy of the spatial point patterns analysis, therefore we have to handle these duplicates in this section.

We will use the jittering approach, which adds a small perturbation to the duplicate points so they would not occupy the exact same space and point as their originals.

airbnb2019_ppp_jit <- rjitter(airbnb_2019_ppp, retry = TRUE, nsim = 1, drop = TRUE)

airbnb2021_ppp_jit <- rjitter(airbnb_2021_ppp, retry = TRUE, nsim = 1, drop = TRUE)

hawker_ppp_jit <- rjitter(hawker_ppp, retry = TRUE, nsim = 1, drop = TRUE)

tourism_ppp_jit <- rjitter(tourist_attractions_ppp, retry = TRUE, nsim = 1, drop = TRUE)

hotels_ppp_jit <- rjitter(hotels_ppp, retry = TRUE, nsim = 1, drop = TRUE)

mrt_ppp_jit <- rjitter(mrt_ppp, retry = TRUE, nsim = 1, drop = TRUE)
any(duplicated(airbnb2019_ppp_jit))
[1] FALSE

3.3.5 Creating an owin object

When doing spatial point patterns analysis, a good practice to always follow is to confine the analysis with a geographical area such as Singapore’s boundary. In spatstat, this polygonal region is known as owin, an object specially designed to represent this boundary.

sg <- st_read(dsn = "_posts/2021-09-19-take-home-exercise-2/the2data/sgdata", layer = "CostalOutline")
Reading layer `CostalOutline' from data source 
  `C:\teojp3\IS415_blog\_posts\2021-09-19-take-home-exercise-2\the2data\sgdata' 
  using driver `ESRI Shapefile'
Simple feature collection with 60 features and 4 fields
Geometry type: POLYGON
Dimension:     XY
Bounding box:  xmin: 2663.926 ymin: 16357.98 xmax: 56047.79 ymax: 50244.03
Projected CRS: SVY21
sg <- st_transform(sg, crs = 3414)

sg_sp <- as(as_Spatial(sg), "SpatialPolygons")

sg <- st_make_valid(sg)

The code chunk below is used to convert sg SpatialPolygon object into owin object of spatstat.

sg_owin <- as(sg_sp, "owin")

The output object combined both the point and polygon feature in one ppp object class as shown below.

airbnb_2019_ppp = airbnb2019_ppp_jit[sg_owin]

plot(airbnb_2019_ppp)

4. Geospatial Analysis

4.1 Section A: Distribution of Airbnb Listings in 2019

4.1.1 Exploratory Spatial Data Analysis

tmap_mode("plot")
tm_shape(airbnb_2019) + tm_dots(alpha = 0.4, col = "blue", size = 0.05) + tm_shape(sg_hotels) +
    tm_dots(alpha = 0.4, col = "red", size = 0.05) + tm_shape(tourist_attractions) +
    tm_dots(alpha = 0.4, col = "green", size = 0.05)

4.1.2 First-order Spatial Point Patterns Analysis

First-order Spatial Point Patterns Analysis (SPPA) will be performed using spatstat package.

Kernel Density Maps will be used to unravel spatial patterns between the location of listings and various location factors mentioned earlier on. The advantage of using Kernel Density over Point Density is that the results are much more spatially accurate, because Point Density usually produces more steep edges which is not desirable.

On the other hand, Kernel density gives smoother results that allows for plotting nicer visuals, which is crucial for getting substantial analysis insights.

Visual comparison of Point Density vs Kernel Density #### Kernel Density Map of Airbnb Listings

# kde_airbnb2019_bw_original <- density(airbnb_2019_ppp, sigma=bw.diggle,
# edge=TRUE, kernel='gaussian')

# kde_airbnb2019_bw <- density(airbnb2019_ppp_owin, sigma=bw.diggle, edge=TRUE,
# kernel='gaussian') gridded_kde_airbnb2019_bw <-
# as.SpatialGridDataFrame.im(kde_airbnb2019_bw)
# spplot(gridded_kde_airbnb2019_bw, main='Airbnb Listings')

# bw <- bw <- bw.diggle(airbnb_2019_ppp) bw
kde_airbnb2019_bw <- density(airbnb_2019_ppp, sigma = bw.diggle, edge = TRUE, kernel = "gaussian")
plot(kde_airbnb2019_bw, main = "Airbnb Listings")

# Convert units to km
kde_airbnb2019.km <- rescale(airbnb_2019_ppp, 1000, "km")

kde_airbnb2019.bw <- density(kde_airbnb2019.km, sigma = 1, edge = TRUE, kernel = "gaussian")
plot(kde_airbnb2019.bw, main = "Airbnb Listings")

gridded_kde_airbnb2019_bw <- as.SpatialGridDataFrame.im(kde_airbnb2019.bw)
spplot(gridded_kde_airbnb2019_bw, main = "Airbnb Listings")

kde_airbnb_2019_bw_raster <- raster(gridded_kde_airbnb2019_bw)
kde_airbnb_2019_bw_raster
class      : RasterLayer 
dimensions : 128, 128, 16384  (nrow, ncol, ncell)
resolution : 0.4170614, 0.2647348  (x, y)
extent     : 2.663926, 56.04779, 16.35798, 50.24403  (xmin, xmax, ymin, ymax)
crs        : NA 
source     : memory
names      : v 
values     : -2.135658e-14, 211.9649  (min, max)
projection(kde_airbnb_2019_bw_raster) <- CRS("+init=EPSG:3414 +datum=WGS84 +units=km")

kde_airbnb_2019_bw_raster
class      : RasterLayer 
dimensions : 128, 128, 16384  (nrow, ncol, ncell)
resolution : 0.4170614, 0.2647348  (x, y)
extent     : 2.663926, 56.04779, 16.35798, 50.24403  (xmin, xmax, ymin, ymax)
crs        : +proj=tmerc +lat_0=1.36666666666667 +lon_0=103.833333333333 +k=1 +x_0=28001.642 +y_0=38744.572 +datum=WGS84 +units=km +no_defs 
source     : memory
names      : v 
values     : -2.135658e-14, 211.9649  (min, max)
tmap_mode("view")

tm_shape(sg) + tm_borders(col = "black", lwd = 1, alpha = 0.5) + tm_shape(kde_airbnb_2019_bw_raster) +
    tm_raster("v", alpha = 0.7) + tm_layout(legend.position = c("right", "bottom"),
    frame = FALSE, title = "Airbnb Listings 2019") + tm_basemap("OpenStreetMap")
plot(kde_airbnb2019_bw)

tmap_mode("view")
tm_shape(airbnb_2019) + tm_dots(alpha = 0.4, col = "blue", size = 0.05) + tm_shape(sg_hotels) +
    tm_dots(alpha = 0.4, col = "red", size = 0.05) + tm_shape(tourist_attractions) +
    tm_dots(alpha = 0.4, col = "green", size = 0.05)
tmap_mode("view")
tm_shape(airbnb_2021) + tm_dots(alpha = 0.4, col = "blue", size = 0.05)

4.2 Section B: Impact of COVID-19 on Distribution of Airbnb Listings in 2021

5. Conclusion

6. References